Abstract

We extend some classical theorems in the theory of orthogonal polynomials on the unit circle to
the matrix case. In particular, we prove a matrix analogue of Szegő's theorem. As a by-product, we
also obtain an elementary proof of the distance formula by Helson and Lowdenslager.

1 Historical background and motivation 

The classical work [15] of Szegő was the first to address the asymptotics of orthogonal polynomials on
the unit circle $\mathbb{T}$ under the assumption that the entropy of the underlying measure $\sigma$ is finite, i.e.,

\[ \int_{\mathbb{T}} \log \sigma' \, \frac{d\theta}{2\pi} > -\infty. \]

Further aspects of Szegő's theory were developed by Geronimus, Verblunsky and others, which led to
a number of other formulas, in various setups, involving the entropy, such as the formula of
Helson-Lowdenslager [8] for multivariate random processes (for a historical account, see [2, §1.1]).

Verblunsky [16, formulas (v) and (vi)] showed that, for any probability measure $\sigma$ on the unit circle
$\mathbb{T}$,

\[ \lim_{n\to\infty} \prod_{k=0}^{n} \bigl(1 - |\alpha_k|^2\bigr) = \exp \int_{\mathbb{T}} \log \sigma' \, \frac{d\theta}{2\pi}. \tag{1} \]

Here $\{\alpha_k\}_{k\ge 0}$ is a sequence of points in the unit disc $\mathbb{D}$ called the parameters of $\sigma$ [11, §8.3] and
$\sigma' = 2\pi\, d\sigma/d\theta$ is the Lebesgue derivative of $\sigma$. The numbers $\{\alpha_k\}_{k\ge 0}$ have different names depending on the
area where they are considered. In the theory of orthogonal polynomials they are known as the Szegő
recurrence coefficients, Verblunsky parameters, or Geronimus parameters; in Schur's theory they are Schur's
parameters; in inverse scattering problems they are reflection coefficients, see [14, §1.1].

In the matrix setting, $\sigma$ is a Borel measure on $\mathbb{T}$ with values in the set of all nonnegative definite
matrices in $\mathcal{M}_\ell$, the set of all $\ell \times \ell$ matrices with complex entries. We denote by $P_\ell(\mathbb{T})$ the set of all
matrix-valued nonnegative measures $\sigma$ on $\mathbb{T}$ that are normalized, i.e.,

\[ \sigma(\mathbb{T}) = 1, \]

where $1$ is the unit matrix in $\mathcal{M}_\ell$. We refer to $P_\ell(\mathbb{T})$ as the class of matrix probability measures.

The matrix case is important in multivariate Time Series and Prediction Theory [7, 8, 12, 13, 17].
As far as we know, the first Szegő-type results on matrix-valued orthogonal polynomials were obtained
by Delsarte, Genin and Kamp [4]; this line of research was continued by Aptekarev and Nikishin [2].
Our method is a combination of the recent theory of matrix orthogonal polynomials presented in
[5] and the approach to Szegő's theory developed in [10, 11]. This combination allows us to avoid
using factorization theory in matrix Hardy classes. Instead, we use only methods of Real Analysis and
Matrix/Operator Theory.

Our main goal is to respond to the following challenge of Damanik, Pushnitski, and Simon [5]: 

Among the deepest and most elegant methods in OPUC are those of Khrushchev [125, 126, 101]. We 
have not been able to extend them to MOPUC! We regard their extension as an important open question... 

Below we provide a full matrix-valued version of Szegő's theorem, yielding the previously known trace
versions as corollaries of our matrix formula.

Throughout the paper, we mostly follow the notation and terminology of [5]. 

2 Main results 

In the matrix case, the parameters $\alpha_k$ are matrices in $\mathcal{M}_\ell$ with norms $\|\alpha_k\|$ not exceeding 1. Here $\|\alpha\|$ is
the norm of the linear operator defined by the matrix $\alpha$ subordinate to the usual Euclidean vector norm
(2-norm) on $\mathbb{C}^\ell$. This operator norm is also known as the spectral or the Euclidean norm. This norm
is well known to equal the largest singular value of the matrix $\alpha$; in particular, if $\alpha$ is self-adjoint, the
norm $\|\alpha\|$ equals the spectral radius of $\alpha$.

We denote by $\alpha^\dagger$ the Hermitian conjugate of $\alpha \in \mathcal{M}_\ell$. The symbol $*$ is reserved for the Szegő dual,
so we do not use it for the adjoint (see (4)).

We assume that the matrix

\[ \int_{\mathbb{T}} p(e^{i\theta})^\dagger \, d\sigma(\theta) \, p(e^{i\theta}) \tag{2} \]

is positive (definite) for any polynomial $p$ with coefficients in $\mathcal{M}_\ell$. Condition (2) is equivalent to the
requirement that $d\sigma$ has full rank $\ell$ for infinitely many points in $\mathbb{T}$. Under this condition, the right
(left) orthogonal matrix polynomials $\varphi_n^R$ ($\varphi_n^L$) are uniquely determined by the standard Gram-Schmidt
orthonormalization. It is important to notice that the left orthogonal matrix polynomials are obtained
with respect to the left quadratic 'form':

\[ \int_{\mathbb{T}} p(e^{i\theta}) \, d\sigma(\theta) \, p(e^{i\theta})^\dagger. \tag{3} \]

Every $\sigma \in P_\ell(\mathbb{T})$ is uniquely determined by the sequence of its parameters $\{\alpha_k\}_{k\ge 0}$. These parameters
are contractive matrices in $\mathcal{M}_\ell$. If $\sigma$ is a matrix-valued measure with parameters $\{\alpha_k\}_{k\ge 0}$, then the
parameters $\{\alpha_k^\dagger\}_{k\ge 0}$ correspond to the measure $\tilde\sigma$ such that

\[ \tilde\sigma(E) = \sigma(\bar E), \qquad \bar E = \{\bar z : z \in E\}, \]

for any Borel set $E$, where $\bar z$ stands for the complex conjugate of a complex number $z$. We write $\varphi_n(z, \sigma)$
for the orthogonal polynomials if the dependence on $\sigma$ is important.

For a matrix polynomial $P_n$ of degree $n$, we define the reversed (or Szegő dual) polynomial $P_n^*$ by

\[ P_n^*(z) = z^n P_n(1/\bar z)^\dagger. \tag{4} \]

The relationship between the left orthogonal polynomials $\varphi_n^L$ and the right orthogonal polynomials $\varphi_n^R$
is given by the formula

\[ \varphi_n^L(e^{i\theta}, \tilde\sigma) = \varphi_n^R(e^{-i\theta}, \sigma)^\dagger \tag{5} \]

(see Corollary 16). The $n$th left normalized orthogonal polynomial $\varphi_n^L(z, \sigma)$ depends on the parameters
$\alpha_0, \alpha_1, \ldots, \alpha_{n-1}$. Hence, the $n$th right polynomial $\varphi_n^R(z, \sigma)$ can be obtained from the left $\varphi_n^L(z, \sigma)$ by
replacing each $\alpha_k$ by $\alpha_k^\dagger$, replacing $z \in \mathbb{T}$ by $\bar z$ and applying the conjugation $\dagger$.
The main result of this paper is the following theorem. 

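Theorem. Every matrix probability measure $\sigma \in P_\ell(\mathbb{T})$ satisfies the following matrix equality:

\[ \lim_{n} \int_{\mathbb{T}} \log\Bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi} = \int_{\mathbb{T}} \log \sigma' \, \frac{d\theta}{2\pi}. \tag{6} \]

If the parameters $\alpha_k$ of $\sigma$ form a family of commuting normal matrices, then (6) can be simplified to

\[ \prod_{k=0}^{\infty} \bigl(1 - \alpha_k \alpha_k^\dagger\bigr) = \exp \int_{\mathbb{T}} \log \sigma' \, \frac{d\theta}{2\pi}. \tag{7} \]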



Remark. Alternatively, the commuting case reduces to the diagonal and hence to the scalar case. 

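Regardless of the normality or commutativity of $\{\alpha_k\}$, the following determinantal-trace version [3]
follows from (6):

\[ \prod_{k=0}^{\infty} \det\bigl(1 - \alpha_k \alpha_k^\dagger\bigr) = \exp \int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi}. \tag{8} \]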



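The symmetry $z \mapsto \bar z$ keeps the Lebesgue measure on $\mathbb{T}$ invariant. Hence, combining (5) and a simple
formula

\[ \int_{\mathbb{T}} \log \tilde\sigma' \, \frac{d\theta}{2\pi} = \int_{\mathbb{T}} \log \sigma' \, \frac{d\theta}{2\pi}, \]

we also obtain the left version of (6):

\[ \lim_{n} \int_{\mathbb{T}} \log\Bigl( \bigl[ \varphi_n^{L,*}(e^{i\theta}) \varphi_n^{L,*}(e^{i\theta})^\dagger \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi} = \int_{\mathbb{T}} \log \sigma' \, \frac{d\theta}{2\pi}. \tag{9} \]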



3 Matrix preliminaries 

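Recall that we denote by $\mathcal{M}_\ell$ the ring of all $\ell \times \ell$ complex-valued matrices, its identity matrix by $1$
and its zero matrix by $0$. Along with the Euclidean norm $\|\cdot\|$ on $\mathcal{M}_\ell$, we also consider the trace norm
$|\alpha|_1 = \operatorname{tr}\bigl((\alpha^\dagger \alpha)^{1/2}\bigr)$ and the Hilbert-Schmidt norm $|\alpha|_2 = \bigl(\operatorname{tr}(\alpha^\dagger \alpha)\bigr)^{1/2}$. It is easy to see that

\[ \|\alpha\| \le |\alpha|_2 \le |\alpha|_1 \le \ell \, \|\alpha\|. \tag{10} \]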
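The chain of inequalities (10) can be checked numerically on a small example. The following is a minimal sketch, assuming a real $2 \times 2$ matrix so that the singular values have a closed form; the helper names `singular_values_2x2` and `norms` are illustrative and not part of the paper.

```python
import math

def singular_values_2x2(m):
    """Singular values of a real 2x2 matrix m = [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    # Entries of the symmetric matrix m^T m = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    mean = (p + r) / 2
    dev = math.hypot((p - r) / 2, q)
    # The eigenvalues of m^T m are mean +/- dev; clamp rounding noise at zero.
    return math.sqrt(mean + dev), math.sqrt(max(mean - dev, 0.0))

def norms(m):
    """Return (operator norm, Hilbert-Schmidt norm, trace norm) of m."""
    s1, s2 = singular_values_2x2(m)
    return s1, math.sqrt(s1 * s1 + s2 * s2), s1 + s2

ell = 2
op, hs, tr = norms([[1.0, 2.0], [3.0, 4.0]])
# Inequality (10): operator <= Hilbert-Schmidt <= trace <= ell * operator.
assert op <= hs <= tr <= ell * op
```

All three norms are functions of the singular values, which is why a single decomposition suffices for the comparison.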

We say that a self-adjoint matrix $A$ ($= A^\dagger$) $\in \mathcal{M}_\ell$ is nonnegative (positive) if the corresponding
quadratic form $x \mapsto x^\dagger A x$ is nonnegative definite (positive definite). We denote the class of all
nonnegative self-adjoint $\ell \times \ell$ matrices by $\mathcal{M}_\ell^+$. The corresponding partial order is known as the Loewner
ordering and is denoted by $\succ$ ($\succeq$): $A \succ B$ means that $A - B$ is positive, i.e., $A - B \succ 0$, and $A \succeq B$ means
that $A - B \succeq 0$, or $A - B \in \mathcal{M}_\ell^+$.

Here is the first fact about the Loewner ordering that we will use later: 

Lemma 1. Let $0 \preceq A_j \preceq B_j$ for $j = 1, \ldots, k$. Then $0 \preceq A_1 + \cdots + A_k \preceq B_1 + \cdots + B_k$.

Proof. Evaluate and compare the quadratic forms of both sums. □ 

We will also need the following result connecting traces of self-adjoint matrices and their Loewner 
ordering: 

Lemma 2. Suppose $A \succeq B$ and $\operatorname{tr} A = \operatorname{tr} B$. Then $A = B$.

Proof. By the linearity of traces, this is equivalent to the statement: Suppose $A \succeq 0$ and $\operatorname{tr} A = 0$. Then
$A = 0$. The latter follows from the fact that $\operatorname{tr} A = \sum_{j=1}^{\ell} e_j^\dagger A e_j$, where every summand is nonnegative, so if
the trace of $A$ is zero, the action of $A$ on all standard unit vectors (hence on the entire space) must be trivial. □

Another fact about traces we will need is the following: 

Lemma 3. If $A \succ 0$, then $\log\det(A) = \operatorname{tr}(\log A)$.

Proof. Without loss of generality, A is a diagonal matrix with positive diagonal elements, since the 
formula is invariant under unitary similarity. But then log A is the diagonal matrix whose elements are 
the logarithms of the diagonal elements of A. The conclusion of the Lemma is thus straightforward. □ 
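Lemma 3 is easy to test numerically. The sketch below is an illustration only: it assumes a real symmetric positive $2 \times 2$ matrix and evaluates the matrix logarithm through the two eigenvalues; the helper `sym_fun_2x2` is a hypothetical name introduced here.

```python
import math

def sym_fun_2x2(f, m):
    """Apply a scalar function f to a real symmetric 2x2 matrix m."""
    (a, b), (_, d) = m
    mean = (a + d) / 2
    dev = math.hypot((a - d) / 2, b)
    l1, l2 = mean + dev, mean - dev  # eigenvalues of m
    if dev == 0.0:  # m is a multiple of the identity
        return [[f(l1), 0.0], [0.0, f(l1)]]
    # f(m) = c1 * m + c0 * 1, the linear interpolant of f at the eigenvalues.
    c1 = (f(l1) - f(l2)) / (l1 - l2)
    c0 = f(l1) - c1 * l1
    return [[c1 * a + c0, c1 * b], [c1 * b, c1 * d + c0]]

A = [[3.0, 1.0], [1.0, 2.0]]  # positive definite, det(A) = 5
log_A = sym_fun_2x2(math.log, A)
trace_log = log_A[0][0] + log_A[1][1]
log_det = math.log(A[0][0] * A[1][1] - A[0][1] * A[1][0])
assert abs(trace_log - log_det) < 1e-12
```

The interpolation step is exactly the diagonalization argument of the proof, written without forming the unitary matrix explicitly.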
Corollary 4. Let $A_1, \ldots, A_n \succ 0$ and let $A := A_1 \cdots A_n \succ 0$. Then

\[ \operatorname{tr} \log(A_1 \cdots A_n) = \log\det(A_1 \cdots A_n) = \sum_{k=1}^{n} \operatorname{tr}(\log A_k). \]

Proof. Apply Lemma 3 to the product $A = A_1 \cdots A_n$. □

Finally, we will need the following interesting characterization of the determinant via the trace (also
used by Helson and Lowdenslager in [8], also see, e.g., [9, Exercise 19, p. 486]):

Lemma 5. Let $\mathcal{A}$ be the set of all matrices in $\mathcal{M}_\ell$ with determinant 1. Then every positive matrix $C$
satisfies

\[ \inf_{A \in \mathcal{A}} \frac{1}{\ell} \operatorname{tr}(A C A^\dagger) = [\det(C)]^{1/\ell}. \tag{11} \]

Proof. Let $U$ be a unitary matrix such that $C = U D U^\dagger$, where $D$ is the diagonal matrix with eigenvalues
$\lambda_1, \ldots, \lambda_\ell$. Then $\lambda = \det(U) \in \mathbb{T}$. It follows that $A C A^\dagger = (A \bar\lambda U) D (A \bar\lambda U)^\dagger$, implying that we may assume
without loss of generality that $C = D$. Then

\[ \operatorname{tr}(A D A^\dagger) = \lambda_1 \|a_1\|^2 + \lambda_2 \|a_2\|^2 + \cdots + \lambda_\ell \|a_\ell\|^2, \]

where $a_k$ denotes the $k$th column of $A$. By the arithmetic-geometric mean inequality,

\[ \frac{\lambda_1 \|a_1\|^2 + \lambda_2 \|a_2\|^2 + \cdots + \lambda_\ell \|a_\ell\|^2}{\ell} \ge \bigl( \lambda_1 \cdots \lambda_\ell \bigr)^{1/\ell} \bigl( \|a_1\| \cdots \|a_\ell\| \bigr)^{2/\ell}. \tag{12} \]

By Hadamard's inequality [9, Inequality 7.8.2],

\[ \|a_1\| \cdots \|a_\ell\| \ge |\det(A)| = 1. \tag{13} \]

The equality in (12) occurs if and only if

\[ \lambda_1 \|a_1\|^2 = \cdots = \lambda_\ell \|a_\ell\|^2. \]

The equality in (13) occurs if and only if the columns $a_k$ form an orthogonal system in $\mathbb{C}^\ell$. It follows
that the equality in (11) is attained for the diagonal matrix $A$ with $a \lambda_1^{-1/2}, \ldots, a \lambda_\ell^{-1/2}$ on the diagonal.
Here $a$ is chosen so as to make the determinant of $A$ equal 1. □
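The extremal property in Lemma 5 can be illustrated numerically. The following is a rough sketch under simplifying assumptions: $\ell = 2$, $C = D$ is already diagonal (as in the proof), and the competitors are real matrices rescaled to determinant 1. The function names are hypothetical.

```python
import math
import random

def normalized_trace(a, lambdas):
    """(1/l) tr(A D A^T) for real A and D = diag(lambdas), summed over columns."""
    ell = len(lambdas)
    total = 0.0
    for k, lam in enumerate(lambdas):
        total += lam * sum(a[i][k] ** 2 for i in range(ell))
    return total / ell

def det2(a):
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

lambdas = [1.0, 4.0]
target = math.sqrt(lambdas[0] * lambdas[1])  # det(D)^(1/l) with l = 2

# Diagonal minimizer from the proof: entries a * lambda_k^(-1/2), det equal to 1.
scale = (lambdas[0] * lambdas[1]) ** 0.25
minimizer = [[scale / math.sqrt(lambdas[0]), 0.0],
             [0.0, scale / math.sqrt(lambdas[1])]]

# Random real competitors, rescaled (and sign-fixed) to have determinant 1.
rng = random.Random(0)
samples = []
while len(samples) < 1000:
    a = [[rng.uniform(-2.0, 2.0) for _ in range(2)] for _ in range(2)]
    d = det2(a)
    if abs(d) < 1e-3:
        continue  # skip nearly singular draws
    s = math.sqrt(abs(d))
    a = [[x / s for x in row] for row in a]
    if d < 0:
        a[0] = [-x for x in a[0]]  # flips the sign of det, keeps column norms
    samples.append(normalized_trace(a, lambdas))

assert abs(normalized_trace(minimizer, lambdas) - target) < 1e-12
assert min(samples) >= target - 1e-9
```

Random sampling only probes the inequality; it does not replace the arithmetic-geometric mean and Hadamard steps of the proof.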

4 Matrix measures 

A matrix-valued nonnegative measure $\mu$ on the unit circle $\mathbb{T}$ is a countably additive mapping of the Borel
$\sigma$-algebra $\mathfrak{B}(\mathbb{T})$ on $\mathbb{T}$ into the set $\mathcal{M}_\ell^+$ of all nonnegative $\ell \times \ell$ matrices, $\mu : B \mapsto \mu(B) \in \mathcal{M}_\ell^+$. It follows
that for any $E \in \mathfrak{B}(\mathbb{T})$

\[ 0 \preceq \mu(E) \preceq \mu(E) + \mu(\mathbb{T} \setminus E) = \mu(\mathbb{T}). \]

Then $\nu(E) = \mu(\mathbb{T})^{-1/2} \mu(E) \mu(\mathbb{T})^{-1/2}$ is also a matrix-valued nonnegative measure, which is called the
normalization of $\mu$. As before, we assume that $\mu$ is normalized: $\mu(\mathbb{T}) = 1$.

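Recall that $P_\ell(\mathbb{T})$ denotes the set of all matrix probability measures, i.e., the normalized
matrix-valued nonnegative measures on $\mathbb{T}$. Let $\{e_j\}_{j=1}^{\ell}$ be the standard basis in $\mathbb{C}^\ell$. Then for every $E \in \mathfrak{B}(\mathbb{T})$
we obtain the matrix of $\mu(E)$

\[ \mu(E) = \begin{pmatrix} \mu_{11}(E) & \mu_{12}(E) & \cdots & \mu_{1\ell}(E) \\ \mu_{21}(E) & \mu_{22}(E) & \cdots & \mu_{2\ell}(E) \\ \vdots & \vdots & & \vdots \\ \mu_{\ell 1}(E) & \mu_{\ell 2}(E) & \cdots & \mu_{\ell\ell}(E) \end{pmatrix}. \tag{14} \]

Since $|\alpha|_1 = \operatorname{tr}(\alpha)$ for every $\alpha \in \mathcal{M}_\ell^+$, we see that

\[ |\mu_{ij}(E)| = |(\mu(E) e_j, e_i)| \le \|\mu(E)\| \le |\mu(E)|_1 = \operatorname{tr}(\mu(E)). \tag{15} \]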
It follows that the entries $\mu_{ij}(E)$ of $\mu(E)$ are finite complex measures on $\mathbb{T}$ which are absolutely continuous
with respect to $\operatorname{tr}(\mu)$. Thus any element $\mu$ of $P_\ell(\mathbb{T})$ is nothing but a table of measures $(\mu_{ij})$ subject to
positivity conditions and domination by $\operatorname{tr}(\mu)$. We say that $\mu \in P_\ell(\mathbb{T})$ is absolutely continuous
(discrete, singular) if so is its trace measure. It follows that any $\mu \in P_\ell(\mathbb{T})$ can be uniquely decomposed
into the sum

\[ \mu = \mu_a + \mu_d + \mu_s, \tag{16} \]

where $\mu_a$ is absolutely continuous with respect to the Lebesgue measure $d\theta/(2\pi)$, $\mu_d$ is the discrete part
of $\mu$ and $\mu_s$ is its singular part. Indeed, taking the Hahn-Lebesgue decomposition of the trace measure,
we can associate three matrix-valued measures with it. Namely, the entries of $\mu_a$ are the absolutely
continuous parts of $\mu_{ij}$ with respect to $\operatorname{tr}(\mu)_a$, and similarly the entries of $\mu_d$ and $\mu_s$ for the discrete and
singular parts, respectively. Since Borel supports of $\operatorname{tr}(\mu)_a$, $\operatorname{tr}(\mu)_d$, $\operatorname{tr}(\mu)_s$ can be chosen to be disjoint,
the positivity of the corresponding matrices follows immediately. Moreover, $d\mu = M(\mu, \zeta) \operatorname{tr}(d\mu)$, where
$M(\mu, \zeta) \in \mathcal{M}_\ell^+$ for $\zeta \in \mathbb{T}$.

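The measure $\mu_a$ can be found by Lebesgue's differentiation theorem:

\[ \mu'(e^{i\theta}) = \lim_{\varepsilon \to 0+} \frac{\mu(I_\varepsilon)}{2\varepsilon} \quad \text{a.e. on } \mathbb{T}, \tag{17} \]

where $I_\varepsilon$ denotes the arc of length $2\varepsilon$ on $\mathbb{T}$ centered at $e^{i\theta}$. Then

\[ \mu_a(E) = \int_E \mu'(e^{i\theta}) \, \frac{d\theta}{2\pi}. \]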



We say that a sequence $\{\mu^{(n)}\}_{n\ge 0}$ in $P_\ell(\mathbb{T})$ converges to $\mu \in P_\ell(\mathbb{T})$ in the $*$-weak topology if

\[ {*}\text{-}\lim_n \mu_{ij}^{(n)} = \mu_{ij} \tag{18} \]

for every pair of indices $(i, j)$. For our class $P_\ell(\mathbb{T})$, we need matrix analogues of two Helly's lemmas as
they are stated in [11, Lemma 8.5, Theorem 8.6] for scalar measures:

Theorem 6. If ${*}\text{-}\lim_n \mu^{(n)} = \mu$ in $P_\ell(\mathbb{T})$, then

\[ \lim_n \mu^{(n)}(I) = \mu(I) \tag{19} \]

for any open arc $I$ on $\mathbb{T}$ such that $\mu$ vanishes at the endpoints of $I$.

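Proof. Let $x$ be an arbitrary fixed column-vector in $\mathbb{C}^\ell$; as usual, $x^\dagger$ denotes its conjugate transpose
(row-vector). Let $t \mapsto f(t)$ be a nonnegative continuous function with values in $[0, 1]$ supported on an
open arc $I$. Then

\[ x^\dagger \mu^{(n)}(I) x = \int_I x^\dagger d\mu^{(n)}(t) x \ge \int_I f(t)\, x^\dagger d\mu^{(n)}(t) x. \tag{20} \]

By (18),

\[ \lim_n \int_I f(t)\, x^\dagger d\mu^{(n)}(t) x = \lim_n \int_I f(t) \sum_{i,j=1}^{\ell} \bar x_i \, d\mu_{ij}^{(n)} x_j = \sum_{i,j=1}^{\ell} \lim_n \int_I f(t)\, \bar x_i \, d\mu_{ij}^{(n)} x_j = \int_I f(t)\, x^\dagger d\mu(t) x. \]

This and (20) imply

\[ \limsup_n x^\dagger \mu^{(n)}(I) x \ge \liminf_n x^\dagger \mu^{(n)}(I) x \ge \sup_f \int_I f(t)\, x^\dagger d\mu(t) x = \int_I x^\dagger d\mu(t) x = x^\dagger \mu(I) x. \tag{21} \]

Similarly, for the complementary arc $J$ to the closure of $I$ in $\mathbb{T}$ we have

\[ \limsup_n x^\dagger \mu^{(n)}(J) x \ge \liminf_n x^\dagger \mu^{(n)}(J) x \ge \sup_f \int_J f(t)\, x^\dagger d\mu(t) x = \int_J x^\dagger d\mu(t) x = x^\dagger \mu(J) x. \tag{22} \]

Since $\mu$ vanishes at the endpoints of $I$ and $\mu^{(n)} \in P_\ell(\mathbb{T})$, we obtain that

\[ \mu^{(n)}(I) + \mu^{(n)}(J) \preceq 1, \qquad \mu(I) + \mu(J) = 1. \tag{23} \]

Combining (21) and (22) with (23), we conclude that

\[ x^\dagger x = x^\dagger 1 x \ge \limsup_n x^\dagger \bigl(\mu^{(n)}(I) + \mu^{(n)}(J)\bigr) x \ge \liminf_n x^\dagger \bigl(\mu^{(n)}(I) + \mu^{(n)}(J)\bigr) x \]
\[ \ge \int_I x^\dagger d\mu(t) x + \int_J x^\dagger d\mu(t) x = x^\dagger \mu(I) x + x^\dagger \mu(J) x = x^\dagger \mu(\mathbb{T}) x = x^\dagger x. \]

This is only possible if equalities hold in (21) and in (22). Since the vector $x$ was arbitrary, this implies
the conclusion of the theorem. □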

Theorem 7. Let $\{\mu^{(n)}\}_{n\ge 0}$ be a sequence in $P_\ell(\mathbb{T})$ and $\mu \in P_\ell(\mathbb{T})$. Then ${*}\text{-}\lim_n \mu^{(n)} = \mu$ if and only
if $\lim_n \mu^{(n)}(I) = \mu(I)$ for any open arc $I$ on $\mathbb{T}$ such that $\mu$ does not have point masses at the endpoints
of $I$.

Proof. One direction has already been proved in Theorem 6. Suppose now that $\lim_n \mu^{(n)}(I) = \mu(I)$ for
any open arc $I$ on $\mathbb{T}$ such that $\mu$ does not have point masses at the endpoints of $I$. Then for every $x \in \mathbb{C}^\ell$,
$\|x\| = 1$, the sequence of usual probability measures $x^\dagger \mu^{(n)} x$ converges to $x^\dagger \mu x$ on any interval whose
endpoints do not carry point masses of $\mu$ (and therefore of $x^\dagger \mu x$, since it is absolutely continuous with respect to
$\mu$). Then by Theorem 8.6 of [11] the sequence $x^\dagger \mu^{(n)} x$ converges to $x^\dagger \mu x$ in the $*$-weak topology. Now
the polarization identity implies ${*}\text{-}\lim_n x^\dagger \mu^{(n)} y = x^\dagger \mu y$ for any pair of vectors $(x, y)$. Setting $x := e_i$,
$y := e_j$ for all pairs $(i, j)$, we obtain the weak limits for all entries of $\mu$. □

Theorem 8. Suppose that $\nu_n = h_n \, d\theta/(2\pi)$, where $h_n$ are matrix-valued functions on $\mathbb{T}$. Suppose that
there is a positive constant C such that 



Then any $*$-weak limit point of $\{\nu_n\}$ is an absolutely continuous matrix-valued measure.

Proof. By the norm equivalence (10) in $\mathcal{M}_\ell$, we can replace the operator norm of $h_n$ by its Hilbert-Schmidt
norm. For each matrix entry, the result of this theorem is standard, so it holds for the Hilbert-Schmidt
(and hence the spectral) norm of the entire matrix as well. □

Every $\mu \in P_\ell(\mathbb{T})$ defines two positive definite quadratic forms on the two-sided module $C(\mathbb{T}, \mathcal{M}_\ell)$ over
$\mathcal{M}_\ell$ of all continuous functions with values in $\mathcal{M}_\ell$. They correspond to the right and left multiplication
and are defined as matrix-valued 'inner products' by

\[ \langle\langle f, g \rangle\rangle_R := \int_{\mathbb{T}} f(x)^\dagger \, d\mu(x) \, g(x), \tag{24} \]
\[ \langle\langle f, g \rangle\rangle_L := \int_{\mathbb{T}} g(x) \, d\mu(x) \, f(x)^\dagger. \tag{25} \]

Let $\mathcal{P}$ denote the set of all polynomials in $z \in \mathbb{C}$ with coefficients from $\mathcal{M}_\ell$. For a nonnegative integer
$n$, $\mathcal{P}_n$ will denote the set of polynomials in $\mathcal{P}$ of degree at most $n$. Note that, to generate an infinite
sequence of orthogonal polynomials, $\mu$ must satisfy (2) for every nonzero polynomial $p$. This is equivalent
to the condition that the non-negative Borel measure

\[ \det\bigl(M(\mu, \zeta)\bigr) \operatorname{tr}(d\mu) \]

has infinite Borel support, see [5, 17].






5 Analysis of operator functions 



In this section we list some properties of the logarithm as an operator function. We start with the 
definitions of operator monotone, convex, and concave functions defined on the half real line (0,oo). 
Let $\mathcal{H}$ be an infinite-dimensional (separable) Hilbert space. Let $B^+(\mathcal{H})$ denote the set of all positive
operators in $B(\mathcal{H})$. A continuous real function $f$ on $(0, \infty)$ is said to be operator monotone (or, more
precisely, operator monotone increasing) if $A \preceq B$ implies $f(A) \preceq f(B)$ for $A, B \in B^+(\mathcal{H})$, and operator
monotone decreasing if $-f$ is operator monotone increasing, i.e., if $A \preceq B$ implies $f(A) \succeq f(B)$, where
$f(A)$ and $f(B)$ are defined via functional calculus as usual. Also, $f$ is said to be operator convex if
$f(\lambda A + (1 - \lambda) B) \preceq \lambda f(A) + (1 - \lambda) f(B)$ for all $A, B \in B^+(\mathcal{H})$ and $\lambda \in (0, 1)$, and operator concave if
$-f$ is operator convex (see also [3]).

One should not expect that the operator monotonicity and the operator convexity of $f$ follow from
the same properties of the scalar function $f$. For example, a power function $t^a$ on $(0, \infty)$ is operator
monotone if and only if $a \in [0, 1]$, operator monotone decreasing if and only if $a \in [-1, 0]$, and operator
convex if and only if $a \in [-1, 0] \cup [1, 2]$ (see, for instance, [3, Chapter V]). Moreover, the function
$f(t) = \exp(t)$ is neither operator monotone nor operator convex on any (spectral) interval.
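A standard $2 \times 2$ example makes the failure of operator monotonicity concrete for $t^2$ (the exponent $a = 2$ lies outside $[0, 1]$). The sketch below uses two positive definite integer matrices chosen for illustration; the helper names are hypothetical.

```python
def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(x, y):
    return [[x[i][j] - y[i][j] for j in range(2)] for i in range(2)]

def is_nonnegative(m):
    """Nonnegativity test for a real symmetric 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return m[0][0] >= 0 and m[1][1] >= 0 and det >= 0

A = [[2, 1], [1, 1]]  # positive definite, det = 1
B = [[3, 1], [1, 1]]  # positive definite, det = 2

# B - A = [[1, 0], [0, 0]] is nonnegative, so A <= B in the Loewner ordering.
assert is_nonnegative(sub(B, A))
# B^2 - A^2 = [[5, 1], [1, 0]] has determinant -1, so A^2 <= B^2 fails.
assert not is_nonnegative(sub(matmul(B, B), matmul(A, A)))
```

The scalar function $t^2$ is increasing on $(0, \infty)$, yet the matrix inequality breaks because $A$ and $B$ do not commute.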

As is known, the operator monotone functions are generated by holomorphic functions that map the
upper half plane into the upper half plane. Clearly, if one fixes a branch of the logarithm so that it is
real on $(0, \infty)$, then the corresponding holomorphic function maps the upper half plane into the upper
half plane.

Proposition 9. The functions $\log t$ and $-1/t$ are operator monotone increasing on $(0, \infty)$.
A detailed proof can be found in [3, Section V.4].

So, $1/t$ is operator monotone decreasing on $(0, \infty)$. Furthermore, it follows from [3, Exercise V.3.14]
that the integration of an operator monotone decreasing function gives an operator concave function.

Proposition 10. The function $\log t$ is operator concave on $(0, \infty)$.

This statement can also be verified by means of [1, Theorem 3.1].
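Operator concavity of the logarithm can be observed on a small example. The following sketch is an illustration only, assuming real symmetric positive $2 \times 2$ matrices and the midpoint $\lambda = 1/2$; the helper `sym_fun_2x2` is a hypothetical name.

```python
import math

def sym_fun_2x2(f, m):
    """Apply a scalar function f to a real symmetric 2x2 matrix m."""
    (a, b), (_, d) = m
    mean = (a + d) / 2
    dev = math.hypot((a - d) / 2, b)
    l1, l2 = mean + dev, mean - dev  # eigenvalues of m
    if dev == 0.0:  # m is a multiple of the identity
        return [[f(l1), 0.0], [0.0, f(l1)]]
    # f(m) = c1 * m + c0 * 1, the linear interpolant of f at the eigenvalues.
    c1 = (f(l1) - f(l2)) / (l1 - l2)
    c0 = f(l1) - c1 * l1
    return [[c1 * a + c0, c1 * b], [c1 * b, c1 * d + c0]]

A = [[2.0, 1.0], [1.0, 1.0]]
B = [[1.0, 0.0], [0.0, 3.0]]
midpoint = [[(A[i][j] + B[i][j]) / 2 for j in range(2)] for i in range(2)]

log_A = sym_fun_2x2(math.log, A)
log_B = sym_fun_2x2(math.log, B)
log_mid = sym_fun_2x2(math.log, midpoint)

# gap = log((A + B)/2) - (log A + log B)/2 should be nonnegative.
gap = [[log_mid[i][j] - (log_A[i][j] + log_B[i][j]) / 2 for j in range(2)]
       for i in range(2)]
gap_trace = gap[0][0] + gap[1][1]
gap_det = gap[0][0] * gap[1][1] - gap[0][1] * gap[1][0]
assert gap_trace >= 0 and gap_det >= 0
```

For a symmetric $2 \times 2$ matrix, nonnegative trace and determinant are equivalent to nonnegativity in the Loewner ordering, which is why these two numbers are enough.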

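Now, we are in a position to formulate the matrix Jensen inequality for the logarithm. Namely,
Proposition 10 and [B, Theorem 4.2] yield the following statement.

Proposition 11. Let $f : \mathbb{T} \to B^+(\mathcal{M}_\ell)$ be a measurable function. Then the following inequality holds:

\[ \int_{\mathbb{T}} \log f(e^{i\theta}) \, \frac{d\theta}{2\pi} \preceq \log \Bigl( \int_{\mathbb{T}} f(e^{i\theta}) \, \frac{d\theta}{2\pi} \Bigr). \]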

Besides monotonicity and convexity, we will also deal with operator continuity. Recall that a function
$f$ defined on $(0, \infty)$ is operator continuous if the relation $\|A_n - A\|_{\mathcal{H}} \to 0$ implies $\|f(A_n) - f(A)\|_{\mathcal{H}} \to 0$
for any $A, A_n \in B^+(\mathcal{H})$.

Proposition 12. The function $\log t$ is operator continuous on $(0, \infty)$.

Proof. Since $\log t$ can be extended to a holomorphic function on $\mathbb{C} \setminus (-\infty, 0]$, the statement follows
directly from the Dunford-Schwartz operator calculus. □

6 Matrix orthogonal polynomials on the unit circle 

We begin by recalling some basic facts from [5] for the convenience of the reader. Let $\sigma \in P_\ell(\mathbb{T})$ be a matrix
probability measure such that $\det(M(\sigma, \zeta)) \operatorname{tr}(d\sigma(\zeta))$ has infinite Borel support. We define right and left
monic orthogonal matrix polynomials $\Phi_n^R$, $\Phi_n^L$ by applying the Gram-Schmidt procedure in $C(\mathbb{T}, \mathcal{M}_\ell)$
with respect to the 'inner products' (24) and (25) to the sequence $\{1, z1, z^2 1, \ldots\}$. In other words, $\Phi_n^R$
is the unique matrix polynomial $z^n 1 + {}$lower order terms satisfying the orthogonality conditions




(26) 




k = 0, 1, . . . , n — 1. 



(27) 






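Similarly, $\Phi_n^L$ is the unique matrix polynomial $z^n 1 + {}$lower order terms satisfying

\[ 0 = \langle\langle z^k 1, \Phi_n^L \rangle\rangle_L := \int_{\mathbb{T}} \Phi_n^L(z) \, d\sigma(z) \, (z^k 1)^\dagger, \qquad k = 0, 1, \ldots, n - 1. \tag{28} \]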

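The normalized orthogonal matrix polynomials are defined by

\[ \varphi_0^L = \varphi_0^R = 1, \qquad \varphi_n^L = \kappa_n^L \Phi_n^L \quad \text{and} \quad \varphi_n^R = \Phi_n^R \kappa_n^R, \tag{29} \]

where the $\kappa$'s are defined according to the normalization conditions

\[ \langle\langle \varphi_n^R, \varphi_m^R \rangle\rangle_R = \delta_{nm} 1, \qquad \langle\langle \varphi_n^L, \varphi_m^L \rangle\rangle_L = \delta_{nm} 1, \tag{30} \]

along with the following positivity conditions:

\[ \kappa_{n+1}^L (\kappa_n^L)^{-1} \succ 0 \quad \text{and} \quad (\kappa_n^R)^{-1} \kappa_{n+1}^R \succ 0. \tag{31} \]

Note that the $\kappa_n$ are determined by the normalization condition up to multiplication on the left by
unitary matrices. It can be shown that these unitaries can always be uniquely chosen so as to satisfy
(31), see [5].
Now define

\[ \rho_n^L := \kappa_n^L (\kappa_{n+1}^L)^{-1} \quad \text{and} \quad \rho_n^R := (\kappa_{n+1}^R)^{-1} \kappa_n^R. \]

Being inverses of positive matrices, $\rho_n^L$ and $\rho_n^R$ are positive definite as well. In particular, we have that

\[ \kappa_n^L = (\rho_0^L \cdots \rho_{n-1}^L)^{-1} \quad \text{and} \quad \kappa_n^R = (\rho_{n-1}^R \cdots \rho_0^R)^{-1}. \tag{32} \]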

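In the matrix case as well as in the scalar case we have the Szegő recursion. Before stating it,
we recall that, for a matrix polynomial $P_n$ of degree $n$, we define the reversed polynomial $P_n^*$ by (4):
$P_n^*(z) = z^n P_n(1/\bar z)^\dagger$.

Theorem 13 ([5]). There is a sequence of contractive matrices $\alpha_n$ in $\mathcal{M}_\ell$ such that

\[ z \varphi_n^L - \rho_n^L \varphi_{n+1}^L = \alpha_n^\dagger \varphi_n^{R,*}, \tag{33} \]
\[ z \varphi_n^R - \varphi_{n+1}^R \rho_n^R = \varphi_n^{L,*} \alpha_n^\dagger, \tag{34} \]

where $\rho_n^L$ and $\rho_n^R$ are defined as follows:

\[ \rho_n^L = (1 - \alpha_n^\dagger \alpha_n)^{1/2}, \qquad \rho_n^R = (1 - \alpha_n \alpha_n^\dagger)^{1/2}. \tag{35} \]

Setting $z = 0$ in (33), (34) and using (29), we derive the following formulas for the parameters:

\[ \alpha_n = -(\kappa_n^R)^{-1} \Phi_{n+1}^L(0)^\dagger (\kappa_n^L)^\dagger = -(\kappa_n^R)^\dagger \Phi_{n+1}^R(0)^\dagger (\kappa_n^L)^{-1}. \tag{36} \]

Alternatively, one can also set $z = 0$ in formulas (3.11) of [5].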

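Lemma 14. The left and right monic orthogonal polynomials of $\tilde\sigma$ and $\sigma$ are related by

\[ \Phi_n^L(e^{i\theta}, \tilde\sigma) = \Phi_n^R(e^{-i\theta}, \sigma)^\dagger. \tag{37} \]

Proof. For $k < n$ we have, by (27),

\[ 0 = \langle\langle z^k 1, \Phi_n^R(z, \sigma) \rangle\rangle_R^\dagger = \int_{\mathbb{T}} \bigl(\Phi_n^R(z, \sigma)\bigr)^\dagger d\sigma(z) (z^k 1) = \int_{\mathbb{T}} \bigl(\Phi_n^R(\bar z, \sigma)\bigr)^\dagger d\tilde\sigma(z) (\bar z^k 1) \]
\[ = \int_{\mathbb{T}} \bigl(\Phi_n^R(\bar z, \sigma)\bigr)^\dagger d\tilde\sigma(z) (z^k 1)^\dagger, \]

which implies (37) by (28). □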
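Proposition 15. If $\{\alpha_k\}_{k\ge 0}$ are the parameters of $\sigma$, then $\{\alpha_k^\dagger\}_{k\ge 0}$ are the parameters of $\tilde\sigma$.

Proof. By (37), the matrix coefficients of the polynomial $\Phi_n^L(e^{i\theta}, \tilde\sigma)$ are the matrices adjoint to the
coefficients of the polynomial $\Phi_n^R(e^{i\theta}, \sigma)$. In particular,

\[ \Phi_{n+1}^L(0, \tilde\sigma) = \Phi_{n+1}^R(0, \sigma)^\dagger. \tag{38} \]

Since $\kappa_0^L = \kappa_0^R = 1$, we see that

\[ \alpha_0(\tilde\sigma) = -\Phi_1^L(0, \tilde\sigma)^\dagger = -\Phi_1^R(0, \sigma) = \alpha_0(\sigma)^\dagger. \]

Suppose that we have already proved that $\alpha_k(\tilde\sigma) = \alpha_k(\sigma)^\dagger$ for $k < n$. Then, by the induction hypothesis and
by (32),

\[ \kappa_n^R(\tilde\sigma) = \kappa_n^L(\sigma)^\dagger, \qquad \kappa_n^L(\tilde\sigma) = \kappa_n^R(\sigma)^\dagger. \tag{39} \]

It follows that

\[ \alpha_n(\tilde\sigma) = -\bigl(\kappa_n^R(\tilde\sigma)\bigr)^{-1} \Phi_{n+1}^L(0, \tilde\sigma)^\dagger \bigl(\kappa_n^L(\tilde\sigma)\bigr)^\dagger = -\bigl(\kappa_n^L(\sigma)^\dagger\bigr)^{-1} \Phi_{n+1}^R(0, \sigma) \, \kappa_n^R(\sigma) \]
\[ = \Bigl( -\kappa_n^R(\sigma)^\dagger \Phi_{n+1}^R(0, \sigma)^\dagger \bigl(\kappa_n^L(\sigma)\bigr)^{-1} \Bigr)^\dagger = \alpha_n(\sigma)^\dagger, \]

see (36). □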
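Corollary 16. The left and right orthogonal polynomials are related by formula (5).

Proof. We have

\[ \varphi_n^L(e^{i\theta}, \tilde\sigma) = \kappa_n^L(\tilde\sigma) \Phi_n^L(e^{i\theta}, \tilde\sigma) = \kappa_n^R(\sigma)^\dagger \Phi_n^R(e^{-i\theta}, \sigma)^\dagger = \varphi_n^R(e^{-i\theta}, \sigma)^\dagger. \]

□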

We next recall the notion of Bernstein-Szegő approximation. We begin with a list of properties of
matrix orthogonal polynomials.

Theorem 17 ([5, Theorem 3.8]). The polynomials $\varphi^L$, $\varphi^R$ satisfy the following conditions:

(i) For $z \in \mathbb{T}$, all of $\varphi_n^{R,*}(z)$, $\varphi_n^{L,*}(z)$, $\varphi_n^R(z)$, $\varphi_n^L(z)$ are invertible.

(ii) For $z \in \mathbb{D}$, $\varphi_n^{R,*}(z)$ and $\varphi_n^{L,*}(z)$ are invertible.

(iii) For any $z \in \mathbb{T}$,

\[ \varphi_n^R(z) \varphi_n^R(z)^\dagger = \varphi_n^L(z)^\dagger \varphi_n^L(z). \tag{40} \]

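Given a finite sequence $\{\alpha_j\}_{j=0}^{n-1}$ of contractive matrices, we can always use the Szegő recursion to
define the polynomials $\varphi_j^L$, $\varphi_j^R$ for $j = 0, 1, \ldots, n$. Analogously to the scalar case, let us define a measure
$d\mu_n$ on $\mathbb{T}$ by

\[ d\mu_n(\theta) = \bigl[ \varphi_n^R(e^{i\theta}) \varphi_n^R(e^{i\theta})^\dagger \bigr]^{-1} \frac{d\theta}{2\pi}. \tag{41} \]

In view of (40), we also see that

\[ d\mu_n(\theta) = \bigl[ \varphi_n^L(e^{i\theta})^\dagger \varphi_n^L(e^{i\theta}) \bigr]^{-1} \frac{d\theta}{2\pi}. \tag{42} \]

Also, directly from the definition of the right orthogonal polynomials, we have

\[ d\mu_n(\theta) = \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \frac{d\theta}{2\pi}. \tag{43} \]

The measure $d\mu_n$ in (43) is called the right Bernstein-Szegő approximation to $\sigma$. The left Bernstein-Szegő
approximation to $\sigma$ is given by

\[ d\mu_n^L(\theta) = \bigl[ \varphi_n^{L,*}(e^{i\theta}) \varphi_n^{L,*}(e^{i\theta})^\dagger \bigr]^{-1} \frac{d\theta}{2\pi}. \tag{44} \]

Now we are in a position to formulate the main result of this section.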
Now we are in a position to formulate the main result of this section. 
Theorem 18 ([5]). The matrix-valued measure $d\mu_n$ is normalized and its right matrix orthogonal
polynomials for $j = 0, \ldots, n$ are $\{\varphi_j^R\}_{j=0}^{n}$. The Verblunsky coefficients for $d\mu_n$ are

\[ \alpha_j(d\mu_n) = \begin{cases} \alpha_j, & j \le n, \\ 0, & j \ge n + 1. \end{cases} \tag{45} \]

Moreover, ${*}\text{-}\lim_{n\to\infty} d\mu_n = d\sigma$.

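Following [5], we associate the matrix

\[ A^L(\alpha, z) = \begin{pmatrix} z(\rho^L)^{-1} & -(\rho^L)^{-1} \alpha^\dagger \\ -z(\rho^R)^{-1} \alpha & (\rho^R)^{-1} \end{pmatrix} \tag{46} \]

to a given matrix parameter $\alpha$. Then

\[ \begin{pmatrix} \varphi_n^L \\ \varphi_n^{R,*} \end{pmatrix} = A^L(\alpha_{n-1}, z) \cdots A^L(\alpha_0, z) \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{47} \]

Applying the adjoint $\dagger$ to both sides and taking the product over $\alpha_j$ for $j = 0, \ldots, n - 1$, we obtain

\[ \varphi_n^{L\dagger} \varphi_n^L + \varphi_n^{R,*\dagger} \varphi_n^{R,*} = \begin{pmatrix} 1 & 1 \end{pmatrix} A^L(\alpha_0, z)^\dagger \cdots A^L(\alpha_{n-1}, z)^\dagger A^L(\alpha_{n-1}, z) \cdots A^L(\alpha_0, z) \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \]

Note that (42) and (43) imply that the equality

\[ \varphi_n^{L\dagger} \varphi_n^L = \varphi_n^{R,*\dagger} \varphi_n^{R,*} \tag{48} \]

holds on the circle $\mathbb{T}$, implying that

\[ \varphi_n^{R,*\dagger} \varphi_n^{R,*} = \frac{1}{2} \begin{pmatrix} 1 & 1 \end{pmatrix} A^L(\alpha_0, z)^\dagger \cdots A^L(\alpha_{n-1}, z)^\dagger A^L(\alpha_{n-1}, z) \cdots A^L(\alpha_0, z) \begin{pmatrix} 1 \\ 1 \end{pmatrix}. \tag{49} \]

Using the fact $\rho^R \alpha = \alpha \rho^L$, the matrix in (46) can be factored as follows:

\[ A^L(\alpha, z) = \begin{pmatrix} (\rho^L)^{-1} & 0 \\ 0 & (\rho^R)^{-1} \end{pmatrix} \begin{pmatrix} z1 & -\alpha^\dagger \\ -z\alpha & 1 \end{pmatrix}. \tag{50} \]

To factor the non-diagonal matrix in (50), we apply Schur's factorization

\[ \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ C A^{-1} & 1 \end{pmatrix} \begin{pmatrix} A & 0 \\ 0 & D - C A^{-1} B \end{pmatrix} \begin{pmatrix} 1 & A^{-1} B \\ 0 & 1 \end{pmatrix} \]

with

\[ A = z1, \quad B = -\alpha^\dagger, \quad C = -z\alpha, \quad D = 1. \]

Then

\[ \begin{pmatrix} z1 & -\alpha^\dagger \\ -z\alpha & 1 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -\alpha & 1 \end{pmatrix} \begin{pmatrix} z1 & 0 \\ 0 & 1 - \alpha\alpha^\dagger \end{pmatrix} \begin{pmatrix} 1 & -z^{-1} \alpha^\dagger \\ 0 & 1 \end{pmatrix}, \tag{51} \]

\[ A^L(\alpha, z) = \begin{pmatrix} (\rho^L)^{-1} & 0 \\ 0 & (\rho^R)^{-1} \end{pmatrix} \begin{pmatrix} 1 & 0 \\ -\alpha & 1 \end{pmatrix} \begin{pmatrix} z1 & 0 \\ 0 & 1 - \alpha\alpha^\dagger \end{pmatrix} \begin{pmatrix} 1 & -z^{-1} \alpha^\dagger \\ 0 & 1 \end{pmatrix}. \tag{52} \]

Since $\det \rho^L = \det \rho^R$, we conclude that

\[ \det A^L(\alpha, z) = z^\ell. \tag{53} \]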

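It is easy to check that

\[ \varphi_1^L(z) = (\rho_0^L)^{-1} (z - \alpha_0^\dagger), \qquad \varphi_1^R(z) = (z - \alpha_0^\dagger)(\rho_0^R)^{-1}. \]

Further, forming the Szegő dual, we obtain

\[ \varphi_1^{L,*}(z) = (1 - z\alpha_0)(\rho_0^L)^{-1}, \qquad \varphi_1^{R,*}(z) = (\rho_0^R)^{-1} (1 - z\alpha_0). \]

After pertinent multiplications, this produces

\[ \varphi_1^{R,*}(z)^\dagger \varphi_1^{R,*}(z) = (1 - \bar z \alpha_0^\dagger)(\rho_0^R)^{-2} (1 - z\alpha_0), \]
\[ \varphi_1^{L,*}(z) \varphi_1^{L,*}(z)^\dagger = (1 - z\alpha_0)(\rho_0^L)^{-2} (1 - \bar z \alpha_0^\dagger). \]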
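Proposition 19. For every $\alpha \in \mathcal{M}_\ell$ and $z \in \mathbb{T}$

\[ (1 - \bar z \alpha^\dagger)(1 - \alpha\alpha^\dagger)^{-1} (1 - z\alpha) = \bigl[ (1 - \bar z \alpha^\dagger)^{-1} + (1 - z\alpha)^{-1} - 1 \bigr]^{-1}. \]

Proof. Consider the matrix polynomial

\[ p(z) = (1 - \bar z \alpha^\dagger)(1 - \alpha\alpha^\dagger)^{-1} (1 - z\alpha). \tag{54} \]

Since $z \bar z = 1$, we have

\[ 1 - \alpha\alpha^\dagger = 1 - z \bar z \alpha\alpha^\dagger = (1 - z\alpha)(1 + \bar z \alpha^\dagger) + z\alpha - \bar z \alpha^\dagger. \tag{55} \]

Similarly,

\[ 1 - \alpha\alpha^\dagger = 1 - z \bar z \alpha\alpha^\dagger = (1 + z\alpha)(1 - \bar z \alpha^\dagger) - z\alpha + \bar z \alpha^\dagger. \tag{56} \]

The sum of the expressions (55) and (56) yields

\[ 2(1 - \alpha\alpha^\dagger) = (1 + z\alpha)(1 - \bar z \alpha^\dagger) + (1 - z\alpha)(1 + \bar z \alpha^\dagger) \]
\[ = \bigl(2 - (1 - z\alpha)\bigr)(1 - \bar z \alpha^\dagger) + (1 - z\alpha)\bigl(2 - (1 - \bar z \alpha^\dagger)\bigr). \tag{57} \]

Let us denote $1 - z\alpha$ by $B$ for brevity. From (54) and (57) we obtain

\[ p(z) = B^\dagger (1 - \alpha\alpha^\dagger)^{-1} B = 2 B^\dagger \bigl[ B(2 - B^\dagger) + (2 - B) B^\dagger \bigr]^{-1} B \]
\[ = 2 \bigl[ (2 - B^\dagger) B^{-\dagger} + B^{-1} (2 - B) \bigr]^{-1} = 2 \bigl[ 2 B^{-\dagger} - 2 + 2 B^{-1} \bigr]^{-1} \]
\[ = \bigl[ (1 - \bar z \alpha^\dagger)^{-1} + (1 - z\alpha)^{-1} - 1 \bigr]^{-1}. \]

□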
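The identity of Proposition 19 lends itself to a quick numerical sanity check. The sketch below assumes $\ell = 2$, one arbitrarily chosen strict contraction $\alpha$ and one point $z$ on the unit circle; all helper names are illustrative.

```python
import cmath

def mul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(x):
    return [[x[j][i].conjugate() for j in range(2)] for i in range(2)]

def inv(x):
    det = x[0][0] * x[1][1] - x[0][1] * x[1][0]
    return [[x[1][1] / det, -x[0][1] / det], [-x[1][0] / det, x[0][0] / det]]

def lin(x, y, c):
    """Entrywise x + c * y."""
    return [[x[i][j] + c * y[i][j] for j in range(2)] for i in range(2)]

ONE = [[1 + 0j, 0j], [0j, 1 + 0j]]
alpha = [[0.3 + 0.1j, 0.2j], [-0.1 + 0j, 0.4 - 0.2j]]  # Hilbert-Schmidt norm < 1
z = cmath.exp(0.7j)  # a point on the unit circle

B = lin(ONE, alpha, -z)  # B = 1 - z * alpha, so dagger(B) = 1 - conj(z) * alpha^dagger
middle = inv(lin(ONE, mul(alpha, dagger(alpha)), -1))  # (1 - alpha alpha^dagger)^(-1)

lhs = mul(mul(dagger(B), middle), B)
rhs = inv(lin(lin(inv(dagger(B)), inv(B), 1), ONE, -1))

error = max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
assert error < 1e-12
```

Since $\|\alpha\| < 1$ here, both $1 - \alpha\alpha^\dagger$ and $B$ are invertible, so every inverse in the check is well defined.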



7 The Bernstein-Szegő approximation

In this section we obtain a formula for the Bernstein-Szegő approximation of a matrix probability
measure.

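Lemma 20. Let $\beta_n$ be the matrix defined by

\[ \beta_n = \exp \int_0^{2\pi} \log\Bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi}. \tag{58} \]

Then $\log \beta_n$ is self-adjoint and nonpositive.

Proof. Since $[\varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta})]^{-1}$ is a positive matrix for every $\theta$, its logarithm is self-adjoint, as is
the integral $\log \beta_n$ of the logarithm of this matrix. By Proposition 11,

\[ \log \beta_n = \int_0^{2\pi} \log\Bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi} \preceq \log \Bigl( \int_0^{2\pi} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \frac{d\theta}{2\pi} \Bigr) = \log \mu_n(\mathbb{T}) = 0, \]

since $\mu_n$ is normalized: $\mu_n(\mathbb{T}) = 1$. □

The standard operator calculus and Lemma 20 imply that the matrix $\beta_n = \exp \log \beta_n$ is self-adjoint
and satisfies

\[ 0 \prec \beta_n \preceq 1. \tag{59} \]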

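Lemma 21. For $\beta_n$, we have

\[ \log\det \beta_n = \operatorname{tr} \log \beta_n = \log\det \prod_{k=0}^{n-1} \bigl(1 - \alpha_k \alpha_k^\dagger\bigr) = \sum_{k=0}^{n-1} \operatorname{tr} \log \bigl(1 - \alpha_k \alpha_k^\dagger\bigr). \tag{60} \]

Proof. Applying elementary transformations and formula (35), we obtain

\[ \operatorname{tr}(\log \beta_n) = \int_0^{2\pi} \operatorname{tr} \log\Bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi} = \int_0^{2\pi} \log\det\Bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi} \]
\[ = \int_0^{2\pi} \log\det\Bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi} + \int_0^{2\pi} \log\det\Bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi} = 2 \operatorname{Re} \int_0^{2\pi} \log\det\Bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta}) \bigr]^{-1} \Bigr) \frac{d\theta}{2\pi} \]
\[ = 2 \operatorname{Re} \log\det\Bigl( \bigl[ \varphi_n^{R,*}(0) \bigr]^{-1} \Bigr) = 2 \log\det\bigl( \rho_0^R \cdots \rho_{n-1}^R \bigr) = \log \prod_{k=0}^{n-1} \det\bigl(1 - \alpha_k \alpha_k^\dagger\bigr), \]

since the function $z \mapsto \log\det([\varphi_n^{R,*}(z)]^{-1})$ is analytic in the closed unit disc. □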
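In the scalar case $\ell = 1$ the statement of Lemma 21 reduces to a classical identity that is easy to test. The sketch below is an illustration under that assumption: it builds $\varphi_n^*$ from the scalar form of the Szegő recursion (33), (34) for three arbitrarily chosen parameters and compares a Riemann sum over the circle with $\sum_k \log(1 - |\alpha_k|^2)$. Function names and sample values are hypothetical.

```python
import cmath
import math

def reversed_polynomial(alphas):
    """Coefficients (lowest degree first) of the scalar phi_n^* from the recursion."""
    phi, phi_star = [1 + 0j], [1 + 0j]
    for a in alphas:
        a = complex(a)
        rho = math.sqrt(1 - abs(a) ** 2)
        z_phi = [0j] + phi        # coefficients of z * phi_n
        star = phi_star + [0j]    # phi_n^* padded to the same length
        # phi_{n+1} = (z phi_n - conj(a) phi_n^*) / rho
        phi = [(p - a.conjugate() * s) / rho for p, s in zip(z_phi, star)]
        # phi_{n+1}^* = (phi_n^* - a z phi_n) / rho
        phi_star = [(s - a * p) / rho for p, s in zip(z_phi, star)]
    return phi_star

def horner(coeffs, z):
    value = 0j
    for c in reversed(coeffs):
        value = value * z + c
    return value

def circle_mean(g, samples=2048):
    """Average of g over equally spaced points of the unit circle."""
    return sum(g(cmath.exp(2j * math.pi * k / samples))
               for k in range(samples)) / samples

alphas = [0.5, -0.3 + 0.2j, 0.1j]
phi_star = reversed_polynomial(alphas)

# Scalar analogue of log(beta_n) in (58) and of the right-hand side of (60).
log_beta = circle_mean(lambda z: -2 * math.log(abs(horner(phi_star, z))))
expected = sum(math.log(1 - abs(a) ** 2) for a in alphas)
# Total mass of the Bernstein-Szego measure |phi_n^*|^(-2) d(theta)/(2 pi).
mass = circle_mean(lambda z: abs(horner(phi_star, z)) ** -2)

assert abs(log_beta - expected) < 1e-10
assert abs(mass - 1.0) < 1e-10
```

The equally spaced Riemann sum is accurate here because the integrands are smooth and periodic; in the matrix case the same experiment would require a matrix logarithm, which this sketch deliberately avoids.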

Remark. In general, $\log(AB)$ cannot be written as $\log A + \log B$ if $A$ and $B$ are matrices. So, the
integral in Lemma 20 cannot be evaluated by the mean value theorem. In other words, the function
$\log([\varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta})]^{-1})$ is in general not a restriction of a harmonic function to the unit circle. Our
next lemma addresses the easy case when the logarithm in question does split.

If $\{\alpha_0, \ldots, \alpha_{n-1}\}$ is a commuting family of normal matrices, then the self-adjoint matrix $\beta_n$ can be
evaluated explicitly as follows:

Lemma 22. Let $\{\alpha_0, \ldots, \alpha_{n-1}\}$ be a commuting family of normal matrices. Then

\[ \beta_n = \prod_{k=0}^{n-1} \bigl(1 - \alpha_k \alpha_k^\dagger\bigr). \]

Proof. The proof follows the proof of Lemma 21, since in this case $\varphi_n^{R,*}(z)$ is a normal matrix for any
value of $z$. □



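8 The Matrix Szegő Theorem

Definition 23. A matrix probability measure $\sigma \in P_\ell(\mathbb{T})$ is said to be a Szegő measure if

\[ \int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi} > -\infty. \tag{61} \]

Theorem 24. For any matrix probability measure $\sigma \in P_\ell(\mathbb{T})$ and any $n \in \mathbb{N}$,

\[ \int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi} \le \operatorname{tr} \log \beta_n = \log \prod_{k=0}^{n-1} \det\bigl(1 - \alpha_k^\dagger \alpha_k\bigr). \tag{62} \]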

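Proof. If $\int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi} = -\infty$, the conclusion of the theorem holds trivially, so assume that $\sigma$ is a Szegő
measure, i.e., the corresponding integral is not $-\infty$. Jensen's matrix inequality from Proposition 11
implies

\[ \int_{\mathbb{T}} \log\Bigl( \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \sigma' \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \Bigr) \frac{d\theta}{2\pi} \preceq \log \Bigl( \int_{\mathbb{T}} \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \sigma' \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \frac{d\theta}{2\pi} \Bigr). \tag{63} \]

Now we replace $\sigma' \, \frac{d\theta}{2\pi}$ by $d\sigma$, which is larger in the Loewner ordering, according to (16). Since $\sigma$ is absolutely
continuous with respect to $\operatorname{tr}(\sigma)$, there exist two disjoint Borel sets $E$ and $F$ and a Borel matrix function
$x \mapsto M(x)$ such that

\[ d\sigma = M \chi_E \operatorname{tr}(d\sigma_a) + M \chi_F \operatorname{tr}(d\sigma_d + d\sigma_s), \]

where $E$ is a Borel support of $\operatorname{tr}(d\sigma_a)$, $F$ is a Borel support of $\operatorname{tr}(d\sigma_d + d\sigma_s)$ and $\chi_E$, $\chi_F$ are the indicators
of $E$ and $F$, respectively. Notice that

\[ \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger M \chi_E \operatorname{tr}(d\sigma_a) \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} = \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \sigma' \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \frac{d\theta}{2\pi}, \]
\[ \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger M \chi_F \operatorname{tr}(d\sigma_d + d\sigma_s) \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} = \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger (d\sigma_d + d\sigma_s) \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \succeq 0. \]

Combining this with the result of Lemma 1, we obtain

\[ \int_{\mathbb{T}} \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \sigma' \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \frac{d\theta}{2\pi} \preceq \int_{\mathbb{T}} \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger d\sigma \, \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2}. \tag{64} \]

By (63) and by the operator monotonicity of the logarithm from Proposition 9, we obtain

\[ \int_{\mathbb{T}} \log\Bigl( \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \sigma' \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \Bigr) \frac{d\theta}{2\pi} \preceq \log \Bigl( \int_{\mathbb{T}} \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger d\sigma \, \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \Bigr) \]
\[ = \log \Bigl( \beta_n^{1/2} \Bigl[ \int_{\mathbb{T}} \varphi_n^{R,*}(e^{i\theta})^\dagger d\sigma \, \varphi_n^{R,*}(e^{i\theta}) \Bigr] \beta_n^{1/2} \Bigr) = \log \bigl( \beta_n^{1/2} \, 1 \, \beta_n^{1/2} \bigr) = \log \beta_n, \tag{65} \]

in view of the orthonormality of the polynomials $\varphi_n$. Next, we have

\[ \operatorname{tr} \log \Bigl( \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \sigma' \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \Bigr) = \log\det \Bigl( \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \sigma' \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \Bigr) \]
\[ = \log \bigl[ \det(\beta_n) \det\bigl( \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr) \det(\sigma') \bigr] = \log\det(\beta_n) + \log\det\bigl( \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr) + \log\det(\sigma') \]
\[ = \operatorname{tr} \log \beta_n + \operatorname{tr} \log \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr] + \operatorname{tr} \log \sigma'. \]

Integrating the above equality and taking into account (58) and (65), we arrive at

\[ \operatorname{tr} \log \beta_n \ge \int_{\mathbb{T}} \operatorname{tr} \log \Bigl( \beta_n^{1/2} \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \sigma' \varphi_n^{R,*}(e^{i\theta}) \bigr] \beta_n^{1/2} \Bigr) \frac{d\theta}{2\pi} \]
\[ = \operatorname{tr} \log \beta_n + \int_{\mathbb{T}} \operatorname{tr} \log \bigl( \bigl[ \varphi_n^{R,*}(e^{i\theta})^\dagger \varphi_n^{R,*}(e^{i\theta}) \bigr] \bigr) \frac{d\theta}{2\pi} + \int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi} = \int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi}. \]

It remains to apply Lemma 21. □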
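Corollary 25. If $\sigma$ is a Szegő measure, then

\[ \int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi} \le \inf_n \operatorname{tr} \log \beta_n \le -\sup_n \log \|\beta_n^{-1}\| \le 0, \tag{66} \]

in particular, $\sup_n \|\beta_n^{-1}\| < +\infty$.

Proof. Since $\beta_n$ satisfies (59), all its eigenvalues $\lambda_k$, $1 \le k \le \ell$, must lie in the interval $(0, 1]$. In addition,

\[ \|\beta_n^{-1}\| = \max_{1 \le k \le \ell} \lambda_k^{-1}. \]

By (62) and Lemma 21,

\[ -\infty < \int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi} \le \log\det(\beta_n) = \operatorname{tr} \log \beta_n \le 0, \]

implying that

\[ \log \|\beta_n^{-1}\| = \max_k \log \lambda_k^{-1} \le \sum_k \log \lambda_k^{-1} = \operatorname{tr} \log \beta_n^{-1} \le -\int_{\mathbb{T}} \operatorname{tr} \log \sigma' \, \frac{d\theta}{2\pi} < +\infty. \]

□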
By compactness of closed balls in the finite-dimensional space $\mathcal{M}_\ell$, a bounded sequence of matrices
has a limit point. It follows that if $\{\beta_n^{-1}\}_{n\ge 0}$ is uniformly bounded, then any of its limit points $\beta^{-1}$ in
$\mathcal{M}_\ell$ satisfies

\[ \|\beta^{-1}\| \le \sup_n \|\beta_n^{-1}\| < +\infty. \tag{67} \]

We denote by $C(\mathbb{T}, \mathcal{M}_\ell^+)$ the set of all continuous matrix functions on $\mathbb{T}$ with values in $\mathcal{M}_\ell^+$, by $L^2(\mathbb{T}, \mathcal{M}_\ell^+)$
the set of all square-integrable matrix functions on $\mathbb{T}$ with values in $\mathcal{M}_\ell^+$, and by $M(\mathbb{T}, \mathcal{M}_\ell^+)$ the set of
all finite Borel measures with values in $\mathcal{M}_\ell^+$.

Let $\mu$ be a finite Borel measure with values in $\mathcal{M}_\ell^+$. Suppose that, for any open arc $I \subset \mathbb{T}$ whose
endpoints do not carry point masses of $\sigma$, the inequality

\[ \mu(I) \preceq \sigma(I) \]

holds. Then we write $d\mu \preceq d\sigma$.

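Theorem 26. Let $\sigma \in P_\ell(\mathbb{T})$ satisfy $\sup_n \|\beta_n^{-1}\| < +\infty$, with $\beta_n$ defined as above, and let $\{f_n\}_{n\ge 0}$ be
a sequence in $C(\mathbb{T}, \mathcal{M}_\ell^+)$ such that $f_n(e^{i\theta}) \succ 0$ on $\mathbb{T}$ and let

\[ \int_{\mathbb{T}} f_n \, \frac{d\theta}{2\pi} \preceq 1; \tag{68} \]
\[ {*}\text{-}\lim_n f_n \, \frac{d\theta}{2\pi} \preceq d\sigma; \tag{69} \]
\[ \log \beta_n \preceq \int_{\mathbb{T}} \log f_n \, \frac{d\theta}{2\pi}. \tag{70} \]

Then

\[ \lim_n \log \beta_n = \int_{\mathbb{T}} \log \sigma' \, \frac{d\theta}{2\pi} \tag{71} \]

in $\mathcal{M}_\ell$.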

Proof. By ([5^1) and (|67p . the sequence of negative operators log^ ra is uniformly bounded. Suppose that 
log f3 is a limit point of this sequence of matrices in Mi- Then there is an infinite subset A of N such 
that 

limlog/3„ = log/3. (72) 

Let 

log + x = max(logx, 0) , log" x — log + x — \ogx . 
Then log + x < x for every x > 0. The Spectral Theorem applied to a (strictly) positive operator A yields 

log + (A)±A. (73) 

We apply (73) to $A := f_n(e^{i\theta})$ pointwise in $\theta$ and obtain the operator inequality

$$\log^+(f_n(e^{i\theta})) \preceq f_n(e^{i\theta}). \qquad (74)$$

Integrating (74) and taking (68) into account, we obtain

$$\int_{\mathbb T} \log^+(f_n(e^{i\theta}))\,\frac{d\theta}{2\pi} \preceq \mathbf 1. \qquad (75)$$

Observing that $\log = \log^+ - \log^-$ and using (75) and (70), we see that

$$\int_{\mathbb T} \log^-(f_n(e^{i\theta}))\,\frac{d\theta}{2\pi} \preceq \mathbf 1 + \log\beta_n^{-1}. \qquad (76)$$

Let

$$d\nu_n^+ := \log^+(f_n(e^{i\theta}))\,\frac{d\theta}{2\pi}, \qquad d\nu_n^- := \log^-(f_n(e^{i\theta}))\,\frac{d\theta}{2\pi}. \qquad (77)$$
Since $\{\beta_n^{-1}\}_{n\ge 0}$ is bounded, (76) implies that $\{\nu_n^-\}_{n\ge 0}$ has a *-weak limit point $\nu^- \in M(\mathbb T, \mathcal M_\ell)$:

$$d\nu^- = *\text{-}\lim_{n\in\Lambda'} d\nu_n^- = (\nu^-)'\,\frac{d\theta}{2\pi} + d\nu_s^- \quad \text{for some } \Lambda' \subset \Lambda, \qquad (78)$$

where $d\nu_s^-$ is the singular part of $d\nu^-$ (it may include the discrete part as well) and $(\nu^-)' = d\nu^-/(d\theta/2\pi)$. It follows from the inequality $(\log^+ x)^2 \le x$ and (68) that

$$\int_{\mathbb T} \big(\log^+(f_n(e^{i\theta}))\big)^2\,\frac{d\theta}{2\pi} \preceq \mathbf 1. \qquad (79)$$
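The two scalar inequalities behind (73) and (79), namely $\log^+ x \le x$ and $(\log^+ x)^2 \le x$ for $x > 0$, lift to positive matrices through the Spectral Theorem, so it suffices to test them on numbers. The Python sketch below is an illustration only; the helper names `log_plus` and `inequalities_hold` and the sampling grid are assumptions made for this check.

```python
import math

def log_plus(x):
    """log^+ x = max(log x, 0) for x > 0."""
    return max(math.log(x), 0.0)

def inequalities_hold(x):
    """Check log^+ x <= x and (log^+ x)^2 <= x at a single point x > 0."""
    lp = log_plus(x)
    return lp <= x and lp * lp <= x

# Logarithmically spaced sample points between 1e-6 and 1e6 (an arbitrary grid).
grid = [10.0 ** (k / 100.0) for k in range(-600, 601)]
all_ok = all(inequalities_hold(x) for x in grid)

# The ratio (log x)^2 / x is maximal at x = e^2, where it equals 4 / e^2 < 1.
worst_ratio = max(log_plus(x) ** 2 / x for x in grid)
print(all_ok, worst_ratio)
```

Applied to the eigenvalues of $f_n(e^{i\theta})$, the same comparison gives the pointwise operator inequalities that are then integrated over $\mathbb T$.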

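By (79), the function $d\nu_n^+/(d\theta/2\pi)$ is in the unit ball of $L^2(\mathbb T, \mathcal M_\ell)$, which is compact in the weak topology of $L^2(\mathbb T, \mathcal M_\ell)$, see Theorem ??. It follows that any *-limit point $\omega$ of $\{\nu_n^+\}_{n\ge 0}$ in $M(\mathbb T, \mathcal M_\ell)$ is absolutely continuous with respect to the Lebesgue measure and, moreover, belongs to $L^2(\mathbb T, \mathcal M_\ell)$. Then there exist a subset $\Lambda'' \subset \Lambda'$ and some $\omega'$ in the unit ball of $L^2(\mathbb T, \mathcal M_\ell)$ such that

$$d\nu^+ := *\text{-}\lim_{n\in\Lambda''} d\nu_n^+ = \omega'\,\frac{d\theta}{2\pi}, \qquad *\text{-}\lim_{n\in\Lambda''} d\nu_n^- = d\nu^-,$$

$$d\nu := d\nu^+ - d\nu^- = \big(\omega' - (\nu^-)'\big)\,\frac{d\theta}{2\pi} - d\nu_s^-, \qquad (80)$$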

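see (78). Let $I$ be an open arc on $\mathbb T$ such that its endpoints do not carry point masses of $d\nu_s^-$ or $d\sigma_s$. By the matrix Jensen inequality from Proposition ?? we get

$$\frac{1}{|I|}\int_I \log f_n\,\frac{d\theta}{2\pi} \preceq \log\Big(\frac{1}{|I|}\int_I f_n\,\frac{d\theta}{2\pi}\Big), \qquad (81)$$

where $|I|$ stands for the normalized Lebesgue measure of $I$.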

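Applying Helly's Theorem ?? separately to $\{\nu_n^+\}_{n\in\Lambda''}$ and to $\{\nu_n^-\}_{n\in\Lambda''}$, we obtain

$$\lim_{n\in\Lambda''} \frac{1}{|I|}\int_I \log f_n\,\frac{d\theta}{2\pi} = \frac{\nu(I)}{|I|}. \qquad (82)$$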

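Applying Helly's Theorem ??, we derive from (69) the inequality

$$\lim_{n\in\Lambda''} \frac{1}{|I|}\int_I f_n\,\frac{d\theta}{2\pi} \preceq \frac{\sigma(I)}{|I|}. \qquad (83)$$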

A substitution of (82) and (83) into (81) results in the inequality

$$\frac{\nu(I)}{|I|} \preceq \log\Big(\frac{\sigma(I)}{|I|}\Big)$$

(here we use the operator continuity of the logarithm, see Proposition ??). It follows from Lebesgue's theorem on differentiation and the operator continuity of the logarithm that

$$\nu' \preceq \log(\sigma') \qquad (84)$$

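almost everywhere on $\mathbb T$. In view of (84) and (70), we obtain

$$\log\beta + \nu_s^-(\mathbb T) \preceq \int_{\mathbb T} d\nu + \nu_s^-(\mathbb T) = \int_{\mathbb T} \nu'\,\frac{d\theta}{2\pi} \preceq \int_{\mathbb T} \log\sigma'\,\frac{d\theta}{2\pi}. \qquad (85)$$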

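Combining (62) with (85), we see that

$$\int_{\mathbb T} \operatorname{tr}\log\sigma'\,\frac{d\theta}{2\pi} = \operatorname{tr}\log\beta \qquad (86)$$

and $\operatorname{tr}\nu_s^-(\mathbb T) = 0$, so $\nu_s^- = 0$ by the nonnegativity of the measure $\nu_s^-$. It follows that

$$\log\beta \preceq \int_{\mathbb T} \log\sigma'\,\frac{d\theta}{2\pi}.$$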
Since the traces of the operators on both sides are equal by (86), we invoke Lemma ?? and conclude that

$$\log\beta = \int_{\mathbb T} \log\sigma'\,\frac{d\theta}{2\pi}.$$

Since $\log\beta$ is an arbitrary limit point of $\{\log\beta_n\}_{n\ge 0}$, we obtain (71). □
Theorem 27. Let $\sigma \in \mathcal P_\ell(\mathbb T)$ satisfy $\sup_n \|\beta_n^{-1}\| < +\infty$. Then

$$\lim_n \log\beta_n = \int_{\mathbb T} \log\sigma'\,\frac{d\theta}{2\pi}.$$

Proof. Set

$$f_n(e^{i\theta}) = \big[\varphi_n(e^{i\theta})^\dagger \varphi_n(e^{i\theta})\big]^{-1}$$

in Theorem 26. Then (68) and (69) follow from Theorem 18. Finally, (70) follows from (58). □
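Theorem 28 ([4, Theorem 18]). For any $\sigma \in \mathcal P_\ell(\mathbb T)$,

$$\log \prod_{k=0}^{\infty} \det(\mathbf 1 - \alpha_k^\dagger \alpha_k) = \int_{\mathbb T} \operatorname{tr}\log\sigma'\,\frac{d\theta}{2\pi}. \qquad (87)$$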

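Proof. Since $\det(\mathbf 1 - \alpha_k^\dagger \alpha_k) \le 1$ for all $k$, the sum of the series

$$\sum_{k=0}^{\infty} \log\det(\mathbf 1 - \alpha_k^\dagger \alpha_k)$$

with nonpositive terms satisfies

$$\sum_{k=0}^{\infty} \log\det(\mathbf 1 - \alpha_k^\dagger \alpha_k) \ge \int_{\mathbb T} \operatorname{tr}\log\sigma'\,\frac{d\theta}{2\pi} \qquad (88)$$

by Theorem 24. We have two cases. If the series on the left-hand side of (88) diverges, then $\sigma$ is not a Szegő measure and both sides of (87) equal $-\infty$. If the series on the left-hand side of (88) converges, then

$$\lim_k \log\det(\mathbf 1 - \alpha_k^\dagger \alpha_k) = \lim_k \operatorname{tr}\log(\mathbf 1 - \alpha_k^\dagger \alpha_k) = 0.$$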

k k 

Since the spectral norm $\|\cdot\|$ of a positive self-adjoint matrix is its largest eigenvalue, it follows that

$$\lim_k \|\alpha_k^\dagger \alpha_k\| = 0.$$

Since $-x \ge \log(1 - x)$ for $0 \le x < 1$, we see that

$$-\|\alpha_k^\dagger \alpha_k\| \ge \log(1 - \|\alpha_k^\dagger \alpha_k\|) \ge \operatorname{tr}\log(\mathbf 1 - \alpha_k^\dagger \alpha_k),$$

implying that

$$\sum_{k=0}^{\infty} \|\alpha_k^\dagger \alpha_k\| < +\infty.$$

Lemma 21 implies that the $\|\beta_n^{-1}\|$ are bounded. An application of Theorem 27 now completes the proof. □
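In the scalar case $\ell = 1$, formula (87) can be checked numerically on a simple example. The Python sketch below is illustrative and rests on an assumption not taken from this section: the standard fact that the measure with density $(1 - r^2)/|1 - r e^{i\theta}|^2$ against $d\theta/2\pi$, $0 \le r < 1$, has $|\alpha_0| = r$ and $\alpha_k = 0$ for $k \ge 1$. The function name `entropy_integral` and the value of `r` are hypothetical.

```python
import cmath
import math

def entropy_integral(r, n_points=2048):
    """Approximate the integral of log(sigma') d(theta)/(2 pi) for the density
    sigma'(theta) = (1 - r^2) / |1 - r e^{i theta}|^2 by an equispaced
    Riemann sum on the circle."""
    total = 0.0
    for j in range(n_points):
        theta = 2.0 * math.pi * j / n_points
        density = (1.0 - r * r) / abs(1.0 - r * cmath.exp(1j * theta)) ** 2
        total += math.log(density)
    return total / n_points

r = 0.5                       # assumed modulus of the single nonzero parameter
lhs = math.log(1.0 - r * r)   # log of the product of (1 - |alpha_k|^2)
rhs = entropy_integral(r)     # numerical value of the entropy integral
print(lhs, rhs)
```

Under the stated assumption the two printed numbers agree up to rounding error, which is the content of (87) for this particular measure.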

Corollary 29 ([4, Theorem 19]). A measure $\sigma \in \mathcal P_\ell(\mathbb T)$ is a Szegő measure if and only if

$$\sum_{k=0}^{\infty} \|\alpha_k^\dagger \alpha_k\| < +\infty.$$

One direction of this corollary was already proved in Theorem 28; the other direction can be obtained analogously to [??].
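For $\ell = 1$ the criterion reduces to the familiar statement that $\prod_k (1 - |\alpha_k|^2) > 0$ exactly when $\sum_k |\alpha_k|^2 < +\infty$. The following Python sketch contrasts two hypothetical parameter sequences chosen only for illustration; the function name `partial_product` is an assumption, and both products telescope, so the results can be compared with closed forms.

```python
def partial_product(alphas):
    """Return the product of (1 - |alpha_k|^2) over the given parameters."""
    prod = 1.0
    for a in alphas:
        prod *= 1.0 - abs(a) ** 2
    return prod

n = 10000
# Square-summable parameters alpha_k = 1/(k+2): the product stays away from 0.
summable = [1.0 / (k + 2) for k in range(n)]
# Parameters alpha_k = (k+2)^(-1/2): the squares are not summable.
not_summable = [(k + 2) ** -0.5 for k in range(n)]

p_summable = partial_product(summable)          # telescopes to (n+2) / (2(n+1))
p_not_summable = partial_product(not_summable)  # telescopes to 1 / (n+1)
print(p_summable, p_not_summable)
```

The first partial product tends to $1/2$ as $n$ grows, while the second tends to $0$, matching the dichotomy between Szegő and non-Szegő measures in the scalar case.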

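Corollary 30. Let $\sigma$ be a Szegő measure and let $\{\varphi_n\}_{n\ge 0}$ be the orthogonal polynomials in $L^2(d\sigma)$. Then

$$*\text{-}\lim_n d\mu_n = *\text{-}\lim_n \log\Big(\big[\varphi_n(e^{i\theta})^\dagger \varphi_n(e^{i\theta})\big]^{-1}\Big)\frac{d\theta}{2\pi} = \log(\sigma')\,\frac{d\theta}{2\pi} \qquad (89)$$

in the weak topology of $M(\mathbb T, \mathcal M_\ell)$.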
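Proof. Apply the proof of Theorem 26 to the measures

$$d\nu_n^+ := d\mu_n^+ := \log^+\Big(\big[\varphi_n(e^{i\theta})^\dagger \varphi_n(e^{i\theta})\big]^{-1}\Big)\frac{d\theta}{2\pi},$$

$$d\nu_n^- := d\mu_n^- := \log^-\Big(\big[\varphi_n(e^{i\theta})^\dagger \varphi_n(e^{i\theta})\big]^{-1}\Big)\frac{d\theta}{2\pi}.$$

Taking (85) into account, we obtain

$$\nu' = \log\sigma' \quad \text{a.e. on } \mathbb T. \qquad (90)$$

The substitution of (90) into the last formula of (80) results in

$$d\nu = \log\sigma'\,\frac{d\theta}{2\pi} = \big(\omega' - (\nu^-)'\big)\frac{d\theta}{2\pi}.$$

Since $\omega$ in the proof of Theorem 26 was an arbitrary *-limit point of $\{\nu_n^+\}_{n\in\Lambda'}$, this implies that $*\text{-}\lim_{n\in\Lambda'} d\nu_n^+ = \omega'\,\frac{d\theta}{2\pi}$. Since $\nu^-$ was an arbitrary *-limit point of $\{\nu_n^-\}_{n\in\Lambda}$, we conclude that $*\text{-}\lim_n d\mu_n = \log\sigma'\,\frac{d\theta}{2\pi}$. □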

9 The Helson-Lowdenslager Theorem 

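Since $\Phi_n^{R,*}$ is left orthogonal to $z\mathbf 1, \dots, z^n\mathbf 1$ (see [??, Lemma 3.2]), it is also left orthogonal to any linear combination $p$ of these matrix functions with coefficients in $\mathcal M_\ell$. Take any such combination $p$. Then

$$\langle\langle \Phi_n^{R,*} - p, \Phi_n^{R,*} - p\rangle\rangle_L = \langle\langle \Phi_n^{R,*}, \Phi_n^{R,*}\rangle\rangle_L + \langle\langle p, p\rangle\rangle_L - \langle\langle \Phi_n^{R,*}, p\rangle\rangle_L - \langle\langle p, \Phi_n^{R,*}\rangle\rangle_L = \langle\langle \Phi_n^{R,*}, \Phi_n^{R,*}\rangle\rangle_L + \langle\langle p, p\rangle\rangle_L.$$

Since every polynomial $Q$ of degree at most $n$ satisfying $Q(0) = \mathbf 1$ is of the form $Q = \Phi_n^{R,*} - p$, we obtain the matrix inequality

$$\langle\langle \Phi_n^{R,*}, \Phi_n^{R,*}\rangle\rangle_L \preceq \langle\langle Q, Q\rangle\rangle_L. \qquad (91)$$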

These facts are also derived for the real line in [??, Formula (2.10)].

It is therefore natural to call the square root of the positive matrix on the left-hand side of (91) the left operator distance from $\mathbf 1$ to $z\mathcal P_{n-1}$. Consequently, the usual distance in the left Hilbert space is given by

$$\operatorname{dist}_L(\mathbf 1, z\mathcal P_{n-1})^2 = \operatorname{tr}\langle\langle \Phi_n^{R,*}, \Phi_n^{R,*}\rangle\rangle_L. \qquad (92)$$

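One easily verifies (see also [??, Lemma 3.1]) that

$$\langle\langle \Phi_n^{R,*}, \Phi_n^{R,*}\rangle\rangle_L = (\kappa_n^R)^{-\dagger}\,\mathbf 1\,(\kappa_n^R)^{-1} = (\kappa_n^R)^{-\dagger}(\kappa_n^R)^{-1}. \qquad (93)$$

It follows from (32) that

$$\big((\kappa_n^R)^{-\dagger}(\kappa_n^R)^{-1}\big)^{1/2} = \rho_{n-1}^R \cdots \rho_0^R = (\mathbf 1 - \alpha_{n-1}\alpha_{n-1}^\dagger)^{1/2} \cdots (\mathbf 1 - \alpha_0\alpha_0^\dagger)^{1/2} \qquad (94)$$

is the left matrix distance from $\mathbf 1$ to $z\mathcal P_{n-1}$.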
Corollary 31. The identity polynomial $\mathbf 1$ is in the left closure of the sets of matrix polynomials $z\mathcal P_{n-1}$ if and only if

$$\exp \int_{\mathbb T} \log\sigma'\,\frac{d\theta}{2\pi} = 0.$$

The distance formula (94) is useful if the parameters $\{\alpha_k\}_{k\ge 0}$ of $\sigma$ are known. If this is not the case, then one can apply an estimate for (??) from below which was obtained by Helson-Lowdenslager in [8].

Now we are in a position to prove the main result of [8].
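Theorem 32 ([8]). For every $\sigma \in \mathcal P_\ell(\mathbb T)$,

$$\exp \int_{\mathbb T} \frac{1}{\ell}\operatorname{tr}\log\sigma'\,\frac{d\theta}{2\pi} = \inf \int_{\mathbb T} \frac{1}{\ell}\operatorname{tr}\big[(A + P)\,d\sigma\,(A + P)^\dagger\big], \qquad (95)$$

where $A$ runs over all matrices with determinant one, and $P$ over all trigonometric polynomials of the form

$$P(e^{i\theta}) = \sum_{k>0} A_k e^{ik\theta}.$$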

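Proof. Combining Lemma ?? with formula (94), we get

$$\inf_A \frac{1}{\ell}\operatorname{tr}\Big(A\,(\kappa_n^R)^{-\dagger}(\kappa_n^R)^{-1}A^\dagger\Big) = \Big[\det\big((\kappa_n^R)^{-\dagger}(\kappa_n^R)^{-1}\big)\Big]^{1/\ell} = \exp\Big\{\frac{1}{\ell}\sum_{j=0}^{n-1}\log\det(\mathbf 1 - \alpha_j^\dagger\alpha_j)\Big\} = \exp\Big\{\frac{1}{\ell}\operatorname{tr}(\log\beta_n)\Big\}.$$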



Combining this formula with (92), we obtain

inf / i tr + PWcrM + P) f l = inf / ^ tr A(l + zA- 1 P)dcr(l + zA" 1 P) t A f 

= mfitrA(( K «)-t(^)-i)At = exp{i£%rlog([^*( e ^)t^(e^ 

Passing to the limit, we arrive at (95), which was initially proved via a different method in [8, Theorem 8].

□ 
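The determinant identity used at the start of this proof, $\inf_{\det A = 1} \frac{1}{\ell}\operatorname{tr}(A M A^\dagger) = (\det M)^{1/\ell}$ for a positive matrix $M$, can be illustrated in the simplest setting. The Python sketch below makes strong simplifying assumptions: $\ell = 2$, a diagonal $M$ with hypothetical eigenvalues `m1`, `m2`, and only diagonal matrices $A = \operatorname{diag}(t, 1/t)$, which form a small slice of the matrices with determinant one. The function name `normalized_trace` is likewise an assumption.

```python
import math

def normalized_trace(t, m1, m2):
    """(1/2) tr(A M A^dagger) for M = diag(m1, m2) and A = diag(t, 1/t)."""
    return 0.5 * (t * t * m1 + m2 / (t * t))

m1, m2 = 0.3, 0.8                 # hypothetical eigenvalues of a positive 2 x 2 matrix M
det_root = math.sqrt(m1 * m2)     # (det M)^(1/l) with l = 2

# Scan a grid of t > 0; by the AM-GM inequality no value drops below det_root.
grid_min = min(normalized_trace(0.01 * k, m1, m2) for k in range(1, 1001))

# Within this diagonal family the minimum is attained at t^2 = sqrt(m2 / m1).
t_opt = (m2 / m1) ** 0.25
attained = normalized_trace(t_opt, m1, m2)
print(det_root, grid_min, attained)
```

Within this restricted family the minimum equals $\sqrt{m_1 m_2}$; the general statement over all matrices with determinant one is the lemma invoked in the proof and is not verified by this sketch.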